AdaBoost and Forward Stagewise Regression are First-Order Convex Optimization Methods
Authors: Robert M. Freund, Paul Grigas, Rahul Mazumder
Abstract
Boosting methods are highly popular and effective supervised learning methods which combine weak learners into a single accurate model with good statistical performance. In this paper, we analyze two well-known boosting methods, AdaBoost and Incremental Forward Stagewise Regression (FSε), by establishing their precise connections to the Mirror Descent algorithm, which is a first-order method in convex optimization. As a consequence of these connections we obtain novel computational guarantees for these boosting methods. In particular, we characterize convergence bounds of AdaBoost, related to both the margin and log-exponential loss function, for any step-size sequence. Furthermore, this paper presents, for the first time, precise computational complexity results for FSε.
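For orientation, the two quantities named in the abstract can be written out. The following is a sketch in standard notation (ours, not quoted from the paper): given data (x_i, y_i) with labels y_i ∈ {−1, +1}, weak learners h_j, and nonnegative coefficients λ_j, AdaBoost forms the classifier sign(Σ_j λ_j h_j(x)), and the two objectives it is analyzed against are

    L(\lambda) = \log\Big( \tfrac{1}{n} \sum_{i=1}^{n} \exp\big( -y_i \sum_{j} \lambda_j h_j(x_i) \big) \Big) \quad \text{(log-exponential loss)}

    \rho(\lambda) = \min_{1 \le i \le n} \frac{ y_i \sum_{j} \lambda_j h_j(x_i) }{ \|\lambda\|_1 } \quad \text{(margin)}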
Similar resources
A New Perspective on Boosting in Linear Regression via Subgradient Optimization and Relatives
Boosting [6,9,12,15,16] is an extremely successful and popular supervised learning technique that combines multiple “weak” learners into a more powerful “committee.” AdaBoost [7, 12, 16], developed in the context of classification, is one of the earliest and most influential boosting algorithms. In our paper [5], we analyze boosting algorithms in linear regression [3,8,9] from the perspective o...
Stagewise Lasso
Many statistical machine learning algorithms (in regression or classification) minimize either an empirical loss function as in AdaBoost, or a penalized empirical loss as in SVM. A single regularization tuning parameter controls the trade-off between fidelity to the data and generalizability, or equivalently between bias and variance. When this tuning parameter changes, a regularization “path” of...
A general framework for fast stagewise algorithms
Forward stagewise regression follows a very simple strategy for constructing a sequence of sparse regression estimates: it starts with all coefficients equal to zero, and iteratively updates, by a small amount ε, the coefficient of the variable that achieves the maximal absolute inner product with the current residual. This procedure has an interesting connection to the lasso: under some conditi...
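As a concrete rendering of the procedure just described, here is a minimal sketch of incremental forward stagewise regression in Python; the function name, default step size eps, and fixed iteration count are illustrative choices of ours, not code from any of the papers listed here.

    import numpy as np

    def forward_stagewise(X, y, eps=0.01, n_steps=5000):
        # Start with all coefficients equal to zero.
        n, p = X.shape
        beta = np.zeros(p)
        residual = y.astype(float).copy()
        for _ in range(n_steps):
            # Inner product of each variable with the current residual.
            corr = X.T @ residual
            # Pick the variable with the maximal absolute inner product.
            j = int(np.argmax(np.abs(corr)))
            # Nudge its coefficient by a small amount eps, with matching sign.
            delta = eps * np.sign(corr[j])
            beta[j] += delta
            residual -= delta * X[:, j]
        return beta

With a small eps and many steps, the coefficient profiles traced out by beta form a path, which is the lasso connection the snippet alludes to.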
Stagewise Lasso
Many statistical machine learning algorithms minimize either an empirical loss function as in AdaBoost, or a penalized empirical loss as in Lasso or SVM. A single regularization tuning parameter controls the trade-off between fidelity to the data and generalizability, or equivalently between bias and variance. When this tuning parameter changes, a regularization “path” of solutions to the minim...
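For concreteness, the penalized empirical loss referred to here is, in the Lasso case (standard formulation, our notation):

    \min_{\beta} \; \tfrac{1}{2} \| y - X\beta \|_2^2 + \lambda \|\beta\|_1

where the tuning parameter λ ≥ 0 sets the trade-off between fidelity to the data and generalizability; sweeping λ from large to small traces out the regularization path of solutions.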
Boosting and Maximum Likelihood for Exponential Models
We derive an equivalence between AdaBoost and the dual of a convex optimization problem, showing that the only difference between minimizing the exponential loss used by AdaBoost and maximum likelihood for exponential models is that the latter requires the model to be normalized to form a conditional probability distribution over labels. In addition to establishing a simple and easily understoo...
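A one-line illustration of the equivalence described here, in our own notation and under one common parameterization (not quoted from the paper): with predictions f(x) and labels y ∈ {−1, +1}, AdaBoost minimizes the unnormalized exponential loss, while normalizing the same model into a conditional distribution over labels yields a logistic-type negative log-likelihood:

    \sum_{i=1}^{n} \exp\big( -y_i f(x_i) \big) \quad \text{(exponential loss, unnormalized)}

    p(y \mid x) = \frac{e^{\,y f(x)}}{e^{\,f(x)} + e^{\,-f(x)}} \quad\Rightarrow\quad -\log p(y_i \mid x_i) = \log\big( 1 + e^{-2 y_i f(x_i)} \big) \quad \text{(normalized model)}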
Journal: CoRR
Volume: abs/1307.1192
Publication date: 2013